I.  INTRODUCTION


Estimation of a density function has drawn considerable attention in
the literature over the last two decades. Examples of practical situations
calling for the estimation of a density can be found in the works of several
authors, e.g., Murthy (1965), Singh (1977b), Liang and Krishnaiah (1985),
among others.

In one of the pioneering papers on the problem of nonparametric
estimation of a continuous density, a very useful and rather disappointing
observation was made by Rosenblatt (1956). According to this observation,
any reasonable estimator of a continuous density cannot be unbiased.

Therefore, any attempt to improve upon the bias, M.S.E., or rates of
convergence involved in the asymptotics becomes a desirable exercise. The
work reported here is an attempt in this direction.

While treating an inference problem relating to a variate Y, a
possible approach to gain in precision is to incorporate a concomitant
random variable X along with Y. A considerable part of the statistical
literature has been devoted to this approach. In Section 1 we propose
some estimators of a univariate probability density function $f(y)$ of a
r.v. Y based upon a set of observations taken from a bivariate joint density
$\beta(x,y)$ of Y and a suitably chosen concomitant r.v. X, so that
$f(y) = \int \beta(x,y)\,dx$. The estimators are constructed using some well
known heuristic methods employed in other areas of statistics but never
applied in the area of density estimation. Although the asymptotic properties
and rates of convergence of these estimators are the same as those of the
usual estimator, which does not depend on the data on X, we give sufficient
conditions on $\beta$ and the marginal densities of Y and X under which
the proposed estimators would perform better than the usual estimators in
the sense of |Bias| and the MSE. The ideas developed here can easily
be extended to the case when Y and X are both multivariate.

Different methods of constructing estimators, apparently none better
than others in a global sense (Watson (1969), Wegman (1972a), (1972b)),
have appeared in the literature. However, we have adopted the most widely
used Rosenblatt (1956) - Parzen (1962) type kernel method.

In Section 2, we have also looked into the problem of estimating a
conditional density $g(y|x)$ of a r.v. Y given another r.v. X based on
a set of paired observations on $(X,Y) \sim \beta(x,y)$ and a set of additional
observations on $X \sim f(x)$. This problem without the use of additional data
has been treated by Rosenblatt (1969). We have obtained a better approximation
for the variance as compared to Rosenblatt (1969), and have given sufficient
conditions on $\beta$ and $f$ under which the use of additional data on X gives
smaller absolute error and variance (and hence mean squared error) than
those obtained without using the additional data. These conditions need to
be examined more carefully to ease their application to practical problems.
1.  RATIO  TYPE  KERNEL  ESTIMATORS  OF  A  DENSITY


In order to estimate a continuous density $f$ of a random variable Y,
the design proposed is to sample from a bivariate population $(X,Y) \sim \beta(x,y)$
where $X \sim \psi(x)$ is a suitably chosen concomitant variable such that
$f(y) = \int \beta(x,y)\,dx$ and $\psi(x) = \int \beta(x,y)\,dy$.

We first treat the case when $\psi$ is a known density. It is possible to
conceive of situations where this may be the case. Moreover, some of the
results obtained under this assumption will be used in treating the other
case when $\psi$ is unknown.


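1.1  THE  CASE:  $\psi$  KNOWN


Let $(X_j, Y_j)$, $j = 1, 2, \ldots, n$, be a sample from $\beta(x,y)$.


Define

\[ \hat f_n(y) = (nh)^{-1} \sum_{j=1}^{n} K\left(\frac{Y_j - y}{h}\right), \qquad \hat\psi_n(x) = (nh)^{-1} \sum_{j=1}^{n} K\left(\frac{X_j - x}{h}\right) \]

where $0 < h = h(n) \to 0$ as $n \to \infty$, and $K$ is a Borel-measurable bounded
function on the real line such that

\[ \int K(u)\,du = 1, \quad \int uK(u)\,du = 0, \quad \int u^2 K(u)\,du < \infty, \]
\[ \int |K(u)|\,du < \infty, \quad \text{and} \quad |uK(u)| \to 0 \ \text{as} \ |u| \to \infty. \]

Throughout the remainder of this section, we denote $\int K^2(u)\,du$ by $L_2(K)$ and
$2^{-1}\int u^2 K(u)\,du$ by $k_2$.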

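We propose a ratio type estimator of $f(y)$. For $y \in S_f = \{v : f(v) > 0\}$,
define

\[ \hat f_R(y) = \hat f_n(y)\,\psi(x) / \hat\psi_n(x) \]

where $x$ is a suitably chosen point from $S_\psi$. The estimator $\hat f_R$ is well
defined as it follows from Parzen (1962) that

\[ P[\hat f_n(y) > 0] \to 1, \quad \forall\, y \in S_f \]

and

\[ P[\hat\psi_n(x) \ne 0] \to 1, \quad \forall\, x \in S_\psi \]

as $n \to \infty$.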
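The ratio-type construction can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names (`gaussian_kernel`, `kernel_density`, `ratio_estimator`) are hypothetical, the standard normal kernel is one admissible choice of $K$, and the kernel argument convention $(\text{sample}_j - \text{point})/h$ is assumed.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal kernel: integrates to 1 and has zero first moment."""
    return np.exp(-0.5 * np.square(u)) / np.sqrt(2.0 * np.pi)


def kernel_density(sample, point, h, kernel=gaussian_kernel):
    """Kernel estimate (n h)^{-1} * sum_j K((sample_j - point) / h)."""
    sample = np.asarray(sample, dtype=float)
    return float(kernel((sample - point) / h).sum() / (sample.size * h))


def ratio_estimator(x_sample, y_sample, y, x, psi_x, h, kernel=gaussian_kernel):
    """Ratio-type estimate f_R(y) = f_n(y) * psi(x) / psi_n(x), with psi known.

    psi_x is the known value psi(x) at the chosen point x (supplied by the user).
    """
    f_n = kernel_density(y_sample, y, h, kernel)
    psi_n = kernel_density(x_sample, x, h, kernel)
    if psi_n == 0.0:
        raise ZeroDivisionError("psi_n(x) vanished; pick another x or bandwidth")
    return f_n * psi_x / psi_n
```

The estimate rescales the usual kernel estimate of $f(y)$ by how far the kernel estimate of $\psi$ at $x$ is from its known value, so it coincides with $\hat f_n(y)$ whenever $\hat\psi_n(x) = \psi(x)$.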

Let

\[ \epsilon_n = \{\hat f_n(y) - E\hat f_n(y)\}\{E\hat f_n(y)\}^{-1} \]
and

\[ \delta_n = \{\hat\psi_n(x) - E\hat\psi_n(x)\}\{E\hat\psi_n(x)\}^{-1} \]


* The estimators $\hat f_n(y)$ and $\hat\psi_n(x)$ are standard nonparametric estimators
of the respective densities $f$ and $\psi$ based on a technique proposed by
Rosenblatt (1956) and later extended by Parzen (1962) to the now familiar
kernel method of estimation.


Now, for the rest of this section, we will assume that

\[ y \in C_f = S_f \cap \{y : f(y) \ \text{continuous}\} \]
and

\[ x \in C_\psi, \]

the analogous set for $\psi$. For simplicity, we will not write the argument of any function, i.e., we
will write $\phi$ to denote $\phi(\cdot)$. Momentarily, we will drop the subscript
$n$ from $\hat f_n$, $\hat\psi_n$, $\epsilon_n$ and $\delta_n$.

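In terms of the r.v.'s $\epsilon$ and $\delta$, we have

\[ \hat f_R = \psi \cdot E\hat f \cdot (E\hat\psi)^{-1} (1+\epsilon)(1+\delta)^{-1} \]

and

\[ \hat f_R^2 = \psi^2 (E\hat f)^2 (E\hat\psi)^{-2} \{(1+\epsilon)^2 (1+\delta)^{-2}\}. \]

Ignoring the terms of the order $O(nh)^{-3/2}$ and lower (see Remark 1.1 below)
in the expressions for $E\hat f_R$ and $E\hat f_R^2$, we can write

\[ E\hat f_R = \psi \cdot E\hat f \cdot (E\hat\psi)^{-1} \{1 - E(\epsilon\delta) + E(\delta^2)\} \]

\[ E\hat f_R^2 = \psi^2 (E\hat f)^2 (E\hat\psi)^{-2} \{1 + E\epsilon^2 - 4E(\epsilon\delta) + 3E\delta^2\} \qquad (1.1) \]

This gives, again ignoring the terms of the order $O(nh)^{-3/2}$ and lower,
an approximation for the variance

\[ \sigma^2(\hat f_R) = \psi^2 (E\hat f)^2 (E\hat\psi)^{-2} \{E\delta^2 - 2E(\epsilon\delta) + E\epsilon^2\} \]
\[ = \psi^2 (E\hat\psi)^{-2} \sigma^2(\hat f) + \psi^2 (E\hat f / E\hat\psi)^2 \{E\delta^2 - 2E(\epsilon\delta)\} \]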


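1.1  REMARK

In the approximation (1.1) of $E\hat f_R$, the error of approximation, in
absolute value, is less than $E\,|(1+\epsilon)(1+\delta)^{-1}(-\delta^3)|$ and

\[ E\left|\frac{1+\epsilon}{1+\delta}\,\delta^3\right| \le \left\{E\,\frac{1}{(1+\delta)^2}\right\}^{1/2} \left\{E(1+\epsilon)^2\delta^6\right\}^{1/2} \]

where

\[ 1 + \delta = \hat\psi_n / E\hat\psi_n = w \ \text{(say)} \]

follows, as $n \to \infty$, a normal distribution with mean 1 and variance
$\sigma^2[w] = O(nh)^{-1}$; in fact

\[ \sigma^2[w] = (nh)^{-1}\, \frac{1}{\psi(x)} \int K^2(u)\,du \]

[see Parzen (1962)].

Consequently,

\[ E\,w^{-2} = E\left[w^{-2}\, I\{|w-1| > t \log nh / \sqrt{nh}\}\right] + E\left[w^{-2}\, I\{|w-1| \le t \log nh / \sqrt{nh}\}\right] \]
\[ \le (1 + t \log nh/\sqrt{nh})^{-2}\, P(\sqrt{nh}\,|w-1| > t \log nh) + (1 - t \log nh/\sqrt{nh})^{-2}\, P(\sqrt{nh}\,|w-1| \le t \log nh) \]
\[ = 1 + o(1). \]

Further, since $(1+\epsilon) \sim N(1, O(nh)^{-1})$ and $\delta \sim N(0, O(nh)^{-1})$, it follows
that

\[ \left(E[(1+\epsilon)^2 \delta^6]\right)^{1/2} = O(nh)^{-3/2}. \]

Similarly, it can be shown that the error of approximation (1.1) of
$E(\hat f_R)^2$ is $O(nh)^{-3/2}$.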


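It follows from Singh (1977) that

\[ E\hat f = f + f'' k_2 h^2 + o(h^2) \]

\[ E\hat\psi = \psi + \psi'' k_2 h^2 + o(h^2) \]

\[ \sigma^2(\hat f) = (nh^2)^{-1} \left[E K^2(h^{-1}(Y_1 - y)) - E^2 K(h^{-1}(Y_1 - y))\right] \]
\[ = (nh)^{-1} f L_2(K) + n^{-1}\left\{f' \int vK^2(v)\,dv - f^2\right\} + O(n^{-1}h) \]

\[ \sigma^2(\hat\psi) = (nh)^{-1} \psi L_2(K) + n^{-1}\left\{\psi' \int uK^2(u)\,du - \psi^2\right\} + O(n^{-1}h) \qquad (1.3) \]

Further,

\[ \mathrm{Cov}(\hat f, \hat\psi) = (nh)^{-2} \sum_{i=1}^{n} \mathrm{Cov}\left(K(h^{-1}(Y_i - y)),\ K(h^{-1}(X_i - x))\right) \]
\[ = n^{-1}\left[\int\!\!\int K(u)K(v)\beta(x+hu, y+hv)\,du\,dv - \left(\int K(u)\psi(x+hu)\,du\right)\left(\int K(v)f(y+hv)\,dv\right)\right] \]
\[ = n^{-1}\{\beta - \psi f\} + O(n^{-1}h) \qquad (1.4) \]

provided first order partial derivatives of $\beta$ are continuous at $(x,y)$.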


Now we will prove our main theorem:


1.1  THEOREM

For all $y \in C_f$ and all $x \in C_\psi$ such that the first order partial
derivatives of $\beta$ are continuous at $(x,y)$,


EfR  *  f  +  h2  +  0(h2) 


\[ \sigma^2(\hat f_R) = \sigma^2(\hat f) + A + O(n^{-1}h) \]


where 


-2(ft|»)_1(g-ft|») 


and 


=  "'lf2  j-: 

+  f2  ^4»'juK2(u)du  -  +  (h^)"1 


Kz(u)du 


u, 

- 


with $\sigma^2(\hat f)$ as given in (1.3).


PROOF.  From (1.3) and (1.4),

\[ E\epsilon^2 = \sigma^2(\hat f)(E\hat f)^{-2} \]
\[ = f^{-2}\left[(nh)^{-1} f \int K^2(u)\,du + n^{-1}\left(f' \int uK^2(u)\,du - f^2\right) + O(n^{-1}h)\right], \]

$E(\delta^2)$ has a similar expression with $f$ being replaced by $\psi$,
and

\[ E(\epsilon\delta) = n^{-1}(f\psi)^{-1}(\beta - f\psi) + O(n^{-1}h). \]

Now (1.2) and (1.3) followed by the expressions given for $E\epsilon^2$, $E\delta^2$
and $E(\epsilon\delta)$ complete the proof of the theorem.

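The following corollary is an immediate consequence of Theorem 1.1.


1.1  COROLLARY

If $\int uK^2(u)\,du = 0$ (e.g., uniform or standard normal kernel), then

\[ \sigma^2(\hat f) = (nh)^{-1} f \int K^2 - n^{-1} f^2 + o(n^{-1}), \]
and

\[ \sigma^2(\hat f_R) = (nh)^{-1}\{f + (f^2/\psi)\} \int K^2 - 2n^{-1} f^2 \{1 + (f\psi)^{-1}(\beta - f\psi)\} + o(n^{-1}). \]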


1.2  REMARK  (COMPARISON OF $\hat f_R$ WITH THE USUAL ESTIMATOR $\hat f$)

Under similar conditions, $E\hat f = f + h^2 k_2 f'' + o(h^2)$. Comparing
this with $E\hat f_R$ in (1.5) we see that $|\mathrm{Bias}(\hat f_R)| < |\mathrm{Bias}(\hat f)|$ if and
only if $0 < (\psi''/f'') < 2\psi$. For example, with $f(t) = \psi(t) = (2\pi)^{-1/2}\exp(-t^2/2)$,
this condition is satisfied if

$x^2 < 1 + 2(y^2-1)f(y)$ for $|y| > 1$, and if $x^2 > 2(1-y^2)f(y)$ for $|y| < 1$.

Comparing the variances of $\hat f_R$ and $\hat f$, we see that $\sigma^2(\hat f_R) < \sigma^2(\hat f)$
if and only if

\[ h^{-1} \int K^2 < 2\,\frac{\beta - f\psi}{f} + \psi \qquad (1.7) \]

Thus, if the concomitant variable is chosen in such a way that X and Y
have positive dependence (i.e., $P[X \le x, Y \le y] \ge P[X \le x]\,P[Y \le y]$), all we need
is to choose $x$ and $K$ such that $h^{-1}\int K^2 < \psi(x)$.

If $\{2\psi(x|y) - \psi(x)\} \ge C_0(x,y) > 0$, where $\psi(x|y)$ is the conditional density
of X at $X = x$ given $Y = y$, then we can always satisfy (1.7) by choosing
$x$ and $K$ to make $h^{-1}\int K^2 < C_0(x,y)$. Since the choice of X is at our
will, for a given $y$ it may be possible to include a concomitant variable X
in our design and to choose $x$ such that $2\psi(x|y) > \psi$.
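The sufficient condition $h^{-1}\int K^2 < 2\psi(x|y) - \psi(x)$ can be checked numerically for a specific model. The sketch below is only an illustration under stated assumptions: $(X,Y)$ is taken to be standard bivariate normal with correlation `rho` (so that $X \mid Y = y \sim N(\rho y, 1-\rho^2)$), the kernel is standard normal (so that $\int K^2 = 1/(2\sqrt{\pi})$), and the function names are hypothetical.

```python
import math


def gaussian_kernel_l2():
    """Integral of K^2 for the standard normal kernel: 1 / (2 sqrt(pi))."""
    return 1.0 / (2.0 * math.sqrt(math.pi))


def normal_pdf(t, mean=0.0, var=1.0):
    """Density of N(mean, var) evaluated at t."""
    return math.exp(-0.5 * (t - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def variance_gain_possible(x, y, rho, h, kernel_l2=None):
    """Check h^{-1} * int K^2 < 2 psi(x|y) - psi(x) for a standard bivariate normal.

    Requires |rho| < 1; psi is the N(0, 1) marginal of X (an assumption of this sketch).
    """
    if kernel_l2 is None:
        kernel_l2 = gaussian_kernel_l2()
    psi_x = normal_pdf(x)
    psi_x_given_y = normal_pdf(x, mean=rho * y, var=1.0 - rho ** 2)
    return kernel_l2 / h < 2.0 * psi_x_given_y - psi_x
```

Under these assumptions the check passes at $x = y = 0$, $h = 0.1$ for a very strong correlation such as `rho = 0.99` but fails for `rho = 0`, which suggests how demanding the condition is for small bandwidths.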


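1.2  THEOREM  ASYMPTOTIC  NORMALITY


If $h^2 = o(nh)^{-1/2}$, then

\[ (nh)^{1/2} (\hat f_R - f) \xrightarrow{L} N\left(0,\ f^2 (f^{-1} + \psi^{-1}) \int K^2\right). \]

PROOF.

Since $E(\epsilon\delta) = O(n^{-1})$ and $\sigma^2(\delta) = O(nh)^{-1}$, we write
$\hat f_R = (E\hat f)(E\hat\psi)^{-1}\psi\,[1 + \epsilon - \delta + O_p(nh)^{-1}]$.

Therefore,

\[ (nh)^{1/2} (\hat f_R - f) = (nh)^{1/2} \{(E\hat f)(E\hat\psi)^{-1}\psi - f\} + (nh)^{1/2} (E\hat f)(E\hat\psi)^{-1} \psi\, (\epsilon - \delta) + O_p(nh)^{-1/2} \qquad (1.8) \]

From (1.3), $\{E\hat f \cdot (E\hat\psi)^{-1} \cdot \psi - f\} = O(h^2)$, so the first term on the right hand
side of (1.8) is $o(1)$. Further, since $(nh)^{1/2} (\hat f - E\hat f) \xrightarrow{L} N(0, f\int K^2)$ and
$(nh)^{1/2}(\hat\psi - E\hat\psi) \xrightarrow{L} N(0, \psi\int K^2)$ by Parzen (1962), and $\mathrm{Cov}(\epsilon,\delta) = O(n^{-1})$, we
conclude that the second term of the rhs of (1.8) is asymptotically normal
with mean zero and variance $f^2[f^{-1} + \psi^{-1}]\int K^2$.

The proof of the theorem is now complete.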


1.3  REMARK


In computing the asymptotic variance of $(nh)^{1/2} \hat f_R$ we have ignored
the terms of the order $O(h)$ and lower, and hence the asymptotic variance
of $(nh)^{1/2} \hat f_R$ turns out to be larger than $f\int K^2$, the asymptotic variance
of $(nh)^{1/2} \hat f$. We have, however, seen through the proof of Theorem 1.1
that if we retain the terms of order $O(h)$ in computing the variance of
$(nh)^{1/2} \hat f_R$, then there exist situations where $\hat f_R$ has smaller variance
and MSE compared to those of the usual estimator $\hat f$.

1.2  THE  CASE  OF  UNKNOWN  $\psi$

Since the choice of the concomitant variate X is at our will, we
choose here a concomitant variate X which is extremely cheap to measure
compared to the Y variate, so that we can have a very large sample on X with
very little extra budget. For example, if Y is some biochemical content
in a plant and X is chosen as the weight of the plant, the above condition
is satisfied.

Let $\beta$ denote the joint pdf of $(X,Y)$ so that $f(y) = \int \beta(x,y)\,dx$
and $\psi(x) = \int \beta(x,y)\,dy$. Let $Z_1, \ldots, Z_{n_a}$ be $n_a$ additional i.i.d. observations
on X, independent of the paired data $(X_1,Y_1), \ldots, (X_n,Y_n)$, which are i.i.d. according
to $\beta$. We take $n_a$ large enough so that $(n_a h_a)^{-1} = o(n^{-1})$ where $h_a = h(n_a)$.
Define

\[ \tilde\psi(x) = (n_a h_a)^{-1} \sum_{j=1}^{n_a} K\left(\frac{Z_j - x}{h_a}\right). \]

Our proposed estimator of $f(y)$ is

\[ \tilde f_R(y) = \hat f_R(y) \cdot \frac{\tilde\psi(x)}{\psi(x)} \]

where $y \in S_f$ and $x \in S_\psi$. For the sake of simplicity, we will again not
display the arguments in functions like $\tilde f_R(y)$, $f(y)$, etc.

Since $E\tilde\psi = \psi + O(h_a^2)$, it follows from subsection 1.1 that

\[ E\tilde f_R(y) = E\hat f_R(y) \cdot E\left(\frac{\tilde\psi(x)}{\psi(x)}\right) = f + O(h^2). \]

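Now we examine the variance of $\tilde f_R$. Since for independent random variables
W and V,

\[ \sigma^2(WV) = EW^2 \cdot EV^2 - E^2W \cdot E^2V \]
\[ = \sigma^2(W)\sigma^2(V) + E^2(W)\sigma^2(V) + E^2(V)\sigma^2(W), \]

we can write, with $\hat f_R$ as given in subsection 1.1,

\[ \sigma^2(\tilde f_R) = \sigma^2(\hat f_R)\,\sigma^2\left(\frac{\tilde\psi}{\psi}\right) + E^2(\hat f_R)\,\sigma^2\left(\frac{\tilde\psi}{\psi}\right) + E^2\left(\frac{\tilde\psi}{\psi}\right)\sigma^2(\hat f_R) \]
\[ = \sigma^2(\hat f_R)\,\frac{\sigma^2(\tilde\psi)}{\psi^2} + (E\hat f_R)^2\,\frac{\sigma^2(\tilde\psi)}{\psi^2} + \left(\frac{E\tilde\psi}{\psi}\right)^2 \sigma^2(\hat f_R) \]
\[ = O\left((nh)^{-1}(n_a h_a)^{-1}\right) + O(n_a h_a)^{-1} + \left(1 + O(h_a^2)\right)^2 \sigma^2(\hat f_R), \]

since $\sigma^2(\hat f_R) = O(nh)^{-1}$, $\sigma^2(\tilde\psi) = O(n_a h_a)^{-1}$ and $E\hat f_R = f + O(h^2)$.
Thus, since $(n_a h_a)^{-1} = o(n^{-1})$, we have

\[ \sigma^2(\tilde f_R) = \sigma^2(\hat f_R) + o(n^{-1}) = \sigma^2(\hat f) + A + o(n^{-1}) \]

where $A$ is as given in Theorem 1.1. Therefore, the conclusion of Remark
1.1 continues to hold.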

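With regard to the asymptotic distribution of $\tilde f_R$, we note that
$\tilde\psi = \psi + O_p(n_a h_a)^{-1/2} = \psi + o_p(n^{-1/2})$.

Hence

\[ \tilde f_R = \hat f_R + o_p(n^{-1/2}) \]
and

\[ (nh)^{1/2}(\tilde f_R - f) = (nh)^{1/2}(\hat f_R - f) + o_p(1) \xrightarrow{L} N\left(0,\ f^2\{f^{-1} + \psi^{-1}\}\int K^2\right) \]

from Theorem 1.2.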


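1.4  REMARK  REGRESSION  TYPE  DENSITY  ESTIMATORS


We propose a linear regression type density estimator of $f$ of the form

\[ \hat f_{lr}(y) = \hat f(y) + b\,\{\psi(x) - \hat\psi(x)\}, \]

whose variance is

\[ \sigma^2(\hat f_{lr}) = \sigma^2(\hat f(y)) + b^2 \sigma^2(\hat\psi(x)) - 2b\, \mathrm{Cov}(\hat f(y), \hat\psi(x)) \]

where $\sigma^2(\hat f)$, $\sigma^2(\hat\psi)$ and $\mathrm{Cov}(\hat f, \hat\psi)$ are as given in (1.3) and (1.4).

Thus

\[ \sigma^2(\hat f_{lr}) < \sigma^2(\hat f(y)) \]

if and only if

\[ b^2 < 2b\, \frac{\mathrm{Cov}(\hat f(y), \hat\psi(x))}{\sigma^2(\hat\psi(x))} = 2b\, \frac{\beta - f\psi}{h^{-1}\psi\int K^2 - \psi^2}. \]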


2.  ESTIMATION  OF  A  CONDITIONAL  DENSITY

Let $g(y|x) = \beta(x,y)/f(x)$ be the conditional density of $Y \mid X = x$,
where the couple $(X,Y) \sim \beta(x,y)$, $X \sim f(x) = \int \beta(x,y)\,dy$ and $f(x) > 0$.

Rosenblatt (1969) treats the problem of estimating $g$ on the basis
of a random sample $(X_1,Y_1), \ldots, (X_{n_c},Y_{n_c})$ from the joint distribution of
$(X,Y)$. We are also going to estimate $g$, but under a data set-up which is
slightly more general. In addition to the $n_c$ paired observations $(X_j, Y_j)$,
we also have additional data on X, i.e., a sample from the univariate
distribution of X,

\[ U_1, \ldots, U_{n_a}. \]

Set

\[ N = n_c + n_a. \]

Let $h(t)$ be a positive function such that

\[ h(t) \to 0 \quad \text{and} \quad t\,h^2(t) \to \infty \]

as $t \to \infty$. Set

\[ h_c = h(n_c), \qquad h = h(n_c + n_a) \]

and note that, as $n_c \to \infty$,

\[ h_c \to 0, \quad h \to 0, \quad n_c h_c^2 \to \infty \quad \text{and} \quad Nh \to \infty. \]

Further, let $B(u,v)$ be a Borel measurable bounded function defined
on $R^2$ such that, as $\|(u,v)\| \to \infty$,

\[ \|(u,v)\|^2\, |B(u,v)| \to 0. \]

We also assume that

\[ \int\!\!\int |B(u,v)|\,du\,dv < \infty. \]

Also, let $K$ be a Borel-measurable bounded function defined on the real line
such that

\[ \lim_{|u| \to \infty} |uK(u)| = 0 \]

and

\[ \int |K(u)|\,du < \infty, \quad \int K(u)\,du = 1, \quad \int uK(u)\,du = 0, \quad \int u^2 K(u)\,du < \infty. \]


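Having chosen the weight functions $B$ and $K$ and the sequence of bandwidths
$\{h(n)\}$, we propose the following estimator for $g(y|x)$ at a point of
continuity $(x,y)$ of $\beta(x,y)$ such that $f(x) > 0$.

Define

\[ \hat g_{AS}(y|x) = \hat\beta_{n_c}(x,y) / \hat f_N(x) \]

with

\[ \hat\beta_{n_c}(x,y) = (n_c h_c^2)^{-1} \sum_{j=1}^{n_c} B\left(\frac{X_j - x}{h_c}, \frac{Y_j - y}{h_c}\right), \]
\[ \hat f_N(x) = (Nh)^{-1} \left[\sum_{j=1}^{n_c} K\left(\frac{X_j - x}{h}\right) + \sum_{j=1}^{n_a} K\left(\frac{U_j - x}{h}\right)\right]. \]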

2.1  REMARK


It follows from Parzen (1962) that if $f(x) > 0$, then
$P[\hat f_N(x) > 0] \to 1$ and $P[\hat f_{n_c}(x) > 0] \to 1$. Therefore, $\hat g_{AS}$ is well-defined
in probability.

2.2  REMARK

When there is no additional data, i.e., the case when $n_a = 0$, $\hat g_{AS}$
reduces to the estimator studied by Rosenblatt (1969):

\[ \hat g(y|x) = \hat\beta_{n_c}(x,y) / \hat f_{n_c}(x) \]

where

\[ \hat f_{n_c}(x) = (n_c h_c)^{-1} \sum_{j=1}^{n_c} K\left(h_c^{-1}(X_j - x)\right). \]

It is well known (e.g., Rosenblatt (1956) and Cacoullos (1966)) that
$\hat f_N(x)$ as an estimator of $f(x)$ and $\hat\beta_{n_c}(x,y)$ as that of $\beta(x,y)$ are
consistent in quadratic mean. Intuitively, we expect $\hat g_{AS}$ to estimate $g$
consistently. We prove this and other results in the remainder of this
section.

As before, for the remainder of Section 2, we will not display the
arguments in the functions defined above.
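The estimator with additional data can be sketched as below. This is an illustration under assumptions that the text does not fix: the univariate weight function is taken to be the standard normal density, the bivariate weight function is taken to be its product form, the kernel arguments use the $(\text{observation} - \text{point})/\text{bandwidth}$ convention, and the function names are hypothetical.

```python
import numpy as np


def k1(u):
    """Univariate weight function K: standard normal density (an assumption)."""
    return np.exp(-0.5 * np.square(u)) / np.sqrt(2.0 * np.pi)


def b2(u, v):
    """Bivariate weight function B: product kernel K(u) K(v) (an assumption)."""
    return k1(u) * k1(v)


def conditional_density_as(x_pairs, y_pairs, u_extra, x, y, h_c, h):
    """Estimate g(y|x) as beta_hat(x, y) / f_hat_N(x).

    beta_hat uses only the n_c paired observations with bandwidth h_c;
    f_hat_N pools the paired X values with the n_a extra X values, bandwidth h.
    """
    x_pairs = np.asarray(x_pairs, dtype=float)
    y_pairs = np.asarray(y_pairs, dtype=float)
    u_extra = np.asarray(u_extra, dtype=float)
    n_c = x_pairs.size
    beta_hat = b2((x_pairs - x) / h_c, (y_pairs - y) / h_c).sum() / (n_c * h_c ** 2)
    pooled = np.concatenate([x_pairs, u_extra])
    f_hat = k1((pooled - x) / h).sum() / (pooled.size * h)
    return float(beta_hat / f_hat)
```

With `u_extra` empty and `h == h_c` the function reduces to the paired-data estimator, and with the product kernel assumed here that estimate integrates to 1 over $y$.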
2.1  ASYMPTOTIC  APPROXIMATIONS  FOR  BIAS,  VARIANCE,  AND  THE  DISTRIBUTION
OF  $\hat g_{AS}(y|x)$


In this section we show that $\hat g_{AS}$ as an estimator of $g$ is asymptotically
unbiased, consistent in quadratic mean and asymptotically normal, just like the
usual estimators of $g(y|x)$ proposed by Rosenblatt (1969), which are based
on only paired observations. The approximations for the bias and variance obtained
here for $\hat g_{AS}$, specialized to Rosenblatt's case (i.e., when $n_a = 0$),
are better than those noted in Rosenblatt (1969). We further give sufficient
conditions on $\beta$ and $f$ under which the absolute bias and variance of
$\hat g_{AS}$ are smaller than those for $\hat g$ obtained by Rosenblatt.

Although we have investigated the asymptotic properties with $n_a \to \infty$,
we have observed (though not reported here), through Monte-Carlo simulation,
that for $n_c$ fixed the estimators $\hat g_{AS}(y|x)$ proposed here have in some
cases smaller mean squared error than the usual estimators.

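It is well known (e.g., Singh (1977)) that if $f''$, the second derivative
of $f$, is continuous at $x$, then with $l_1(x) = f''(x)\int u^2 K(u)\,du / 2$ and
$L_2(K) = \int K^2(u)\,du$, we have

\[ E\hat f_N = f + h^2 l_1 + o(h^2) \]

\[ E\hat f_{n_c} = f + h_c^2 l_1 + o(h_c^2) \]

\[ \sigma^2(\hat f_N) = (Nh)^{-1} f L_2(K) + o(Nh)^{-1} \]

\[ \sigma^2(\hat f_{n_c}) = (n_c h_c)^{-1} f L_2(K) + o(n_c h_c)^{-1}, \]

and with

\[ l_2(x,y) = \frac{1}{2}\left[\frac{\partial^2\beta}{\partial x^2} \int\!\!\int u^2 B(u,v)\,du\,dv + \frac{\partial^2\beta}{\partial y^2} \int\!\!\int v^2 B(u,v)\,du\,dv\right] \]
and

\[ L_2(B) = \int\!\!\int B^2(u,v)\,du\,dv, \]

choosing $B$ in such a way that $\int\!\!\int uB(u,v)\,du\,dv = 0 = \int\!\!\int vB(u,v)\,du\,dv$,
we obtain from Rosenblatt (1969) and the techniques used
in Theorem 1 of Parzen (1962) that

\[ E\hat\beta_{n_c} = \beta + h_c^2 l_2 + o(h_c^2) \]

\[ \sigma^2(\hat\beta_{n_c}) = (n_c h_c^2)^{-1}\beta L_2(B) + (n_c h_c)^{-1}\left[\frac{\partial\beta}{\partial x} \int\!\!\int uB^2(u,v)\,du\,dv + \frac{\partial\beta}{\partial y} \int\!\!\int vB^2(u,v)\,du\,dv\right] + o(n_c h_c)^{-1} \]
\[ = (n_c h_c^2)^{-1}\beta L_2(B) + o(n_c h_c^2)^{-1}. \]

For the rest of this section, put $\gamma_1(x) = l_1(x) / f(x)$ and
$\gamma_2(x,y) = \{l_2(x,y) / \beta(x,y) - l_1(x) / f(x)\}$. As with others, the functions
$\gamma_1$ and $\gamma_2$ will be displayed without their arguments.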

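Let

\[ \epsilon = (\hat\beta_{n_c} - E\hat\beta_{n_c})(E\hat\beta_{n_c})^{-1} \]
and

\[ \delta = (\hat f_N - E\hat f_N)(E\hat f_N)^{-1}. \]

Then, in terms of $\epsilon$ and $\delta$, we have

\[ \hat g_{AS} = (E\hat\beta_{n_c})(E\hat f_N)^{-1} \{(1 + \epsilon)(1 + \delta)^{-1}\}. \qquad (2.2) \]

It is well known that $\epsilon$ and $\delta$ are $O_p(n_c h_c^2)^{-1/2}$ and $O_p(Nh)^{-1/2}$
respectively. Further, these are asymptotically normal random variables
with mean zero and with their variances tending towards zero as $n_c h_c^2 \to \infty$
in case of $\epsilon$ and as $Nh \to \infty$ in case of $\delta$. Therefore, it follows that

\[ \hat g_{AS} = (E\hat\beta_{n_c})(E\hat f_N)^{-1} \{1 + \epsilon - \delta - \epsilon\delta + \delta^2\} + O_p(n_c h)^{-3/2}. \qquad (2.3) \]

Further, in view of the comments made in Remark 1.1 of Section 1, it
follows that

\[ E\hat g_{AS} = (E\hat\beta_{n_c})(E\hat f_N)^{-1} \{1 - E(\epsilon\delta) + E(\delta^2)\} + O(n_c h)^{-3/2} \]

and

\[ E(\hat g_{AS}^2) = (E\hat\beta_{n_c})^2 (E\hat f_N)^{-2} \{1 + E\epsilon^2 - 4E(\epsilon\delta) + 3E\delta^2\} + O(n_c h)^{-3/2} \qquad (2.4) \]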


With the above observations, we are now able to prove asymptotic
unbiasedness, quadratic mean consistency, and the asymptotic normality of
$\hat g_{AS}$. Throughout the remainder of this section, we assume that $h_c = h(n_c)$
is such that $\lambda_n = h_c/h \to \lambda < \infty$ and $K$ is such that
$K(\lambda_n u) \to K(\lambda u)$ a.e. in $u$ as $n \to \infty$ (this is assured if $K$ is continuous
a.e.). To prove our main results, we make use of the following lemma.
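2.1  LEMMA


If $\beta$ is continuous at $(x,y)$, then

\[ \mathrm{Cov}(\hat\beta_{n_c}, \hat f_N) = (Nh)^{-1}\beta L_\lambda(KB) + o(Nh)^{-1} \]

where

\[ L_\lambda(KB) = \int\!\!\int B(u,v)\,K(\lambda u)\,du\,dv. \]


PROOF.

Since $(X_j,Y_j)$, $j = 1, \ldots, n_c$, are i.i.d. and are independent of
$\{U_j,\ j = 1, \ldots, n_a\}$,

\[ \mathrm{Cov}(\hat\beta_{n_c}, \hat f_N) = (n_c h_c^2 N h)^{-1} \sum_{j=1}^{n_c} \mathrm{Cov}\left(B\left(\frac{X_j - x}{h_c}, \frac{Y_j - y}{h_c}\right),\ K\left(\frac{X_j - x}{h}\right)\right) \]
\[ = (Nh)^{-1} \left[A_{n_c} - A^*_{n_c}\right] \]

where

\[ A_{n_c} = h_c^{-2}\, E\left[B\left(\frac{X_1 - x}{h_c}, \frac{Y_1 - y}{h_c}\right) K\left(\frac{X_1 - x}{h}\right)\right], \]

\[ A^*_{n_c} = h_c^{-2}\, E B\left(\frac{X_1 - x}{h_c}, \frac{Y_1 - y}{h_c}\right) \cdot E K\left(\frac{X_1 - x}{h}\right). \]

Now consider first $A_{n_c}$. We can write

\[ |A_{n_c} - \beta L_\lambda(KB)| \le \gamma_1(x,y,n_c) + \gamma_2(x,y,n_c) \]

where

\[ \gamma_1(x,y,n_c) = \left|\int\!\!\int \{\beta(x + h_c u,\ y + h_c v) - \beta(x,y)\}\, K(\lambda_n u) B(u,v)\,du\,dv\right| \]
\[ \gamma_2(x,y,n_c) = \beta(x,y) \left|\int\!\!\int \{K(\lambda_n u) - K(\lambda u)\}\, B(u,v)\,du\,dv\right|. \]

Since $K$ is bounded, it follows from Cacoullos (1966) that $\gamma_1 = o(1)$,
and since $K(\lambda_n u) \to K(\lambda u)$, by the dominated convergence theorem, $\gamma_2$ is also $o(1)$.
Hence, $A_{n_c} = \beta L_\lambda(KB) + o(1)$.

Further, from Cacoullos (1966),

\[ E B\left(\frac{X_1 - x}{h_c}, \frac{Y_1 - y}{h_c}\right) = h_c^2\, \beta(x,y) + o(h_c^2) \]

and

\[ E K\left(\frac{X_1 - x}{h}\right) = h f(x) + o(h). \]

Therefore, $A^*_{n_c} = h\beta f + o(h)$.

The proof of the lemma is now complete.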


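2.1  THEOREM  ASYMPTOTIC  UNBIASEDNESS


If the second order partial derivatives of $\beta$ are continuous at
$(x,y)$, then

\[ E\left(\hat g_{AS}(y|x) - g(y|x)\right) = g(y|x)\Big[h_c^2 \gamma_2(x,y) + (h_c^2 - h^2)\,\gamma_1(x) + (Nh)^{-1} f(x)^{-1}\{L_2(K) - L_\lambda(BK)\} + o(\max\{h^2, (Nh)^{-1}\})\Big]. \qquad (2.6) \]

PROOF.

It follows from (2.0) and Lemma 2.1 that

\[ E(\epsilon \cdot \delta) = (E\hat\beta_{n_c}\, E\hat f_N)^{-1}\, \mathrm{Cov}(\hat\beta_{n_c}, \hat f_N) = (Nh)^{-1} f(x)^{-1} L_\lambda(KB) + o(Nh)^{-1}. \qquad (2.7) \]

This result accompanied by (2.4) and (2.0) gives

\[ E\hat g_{AS} = \frac{\beta}{f}\left(1 + \frac{h_c^2 l_2}{\beta} + o(h_c^2)\right)\left(1 + \frac{h^2 l_1}{f} + o(h^2)\right)^{-1} \{1 - E(\epsilon\delta) + E(\delta^2)\} \]

which finally gives (2.6).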


2.2  REMARK


Notice that if $(Nh)^{-1} = o(h_c^2)$, then the bias in Theorem 2.1 is given
by

\[ E\left(\hat g_{AS}(y|x) - g(y|x)\right) = g(y|x)\left[\frac{l_2(x,y)\,h_c^2}{\beta(x,y)} - \frac{l_1(x)\,h^2}{f(x)}\right] + o(h_c^2). \qquad (2.6)' \]

The right hand side of this equation with $n_a = 0$ reduces to what Rosenblatt
(1969) has noted for the bias of the estimator $\hat g$. Writing (2.6)' as

\[ E\left(\hat g_{AS}(y|x) - g(y|x)\right) = g(y|x)\, h_c^2\, \gamma_2(x,y) + g(y|x)(h_c^2 - h^2)\,\gamma_1(x) + o(h_c^2) \qquad (2.6)'' \]

we see that the first term on the right hand side of (2.6)'' is the bias of
$\hat g$ with no additional data.


Thus, we conclude the following corollary:


2.1  COROLLARY


Let $(Nh)^{-1} = o(h_c^2)$. Under the hypothesis of Theorem 2.1,

\[ |\text{bias of } \hat g_{AS}(y|x)| < |\text{bias of } \hat g(y|x)| \]

if and only if $\gamma_2(x,y)$ and $\gamma_1(x)$ are of opposite signs and

\[ \left(1 - \frac{h^2}{h_c^2}\right)|\gamma_1(x)| < 2\,|\gamma_2(x,y)|. \]
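The corollary's condition can be compared directly against the two leading bias terms. The sketch below assumes the leading bias of $\hat g_{AS}$ is $g\,[h_c^2\gamma_2 + (h_c^2 - h^2)\gamma_1]$, that the paired-data estimator corresponds to $h = h_c$, and that $h < h_c$; the function names are hypothetical, and $\gamma_1$, $\gamma_2$ are supplied as plain numbers.

```python
def leading_bias(g, gamma1, gamma2, h_c, h):
    """Leading bias term g * [h_c^2 * gamma2 + (h_c^2 - h^2) * gamma1]."""
    return g * (h_c ** 2 * gamma2 + (h_c ** 2 - h ** 2) * gamma1)


def extra_data_reduces_bias(gamma1, gamma2, h_c, h):
    """Corollary-style check: opposite signs and (1 - h^2/h_c^2)|gamma1| < 2|gamma2|.

    Assumes 0 < h < h_c, i.e. the pooled sample uses the smaller bandwidth.
    """
    opposite_signs = gamma1 * gamma2 < 0
    return opposite_signs and (1.0 - (h / h_c) ** 2) * abs(gamma1) < 2.0 * abs(gamma2)
```

For example, with `gamma1 = 1`, `gamma2 = -1`, `h_c = 0.5` and `h = 0.4` the check passes, and the absolute leading bias drops from 0.25 (no additional data) to 0.16.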


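2.2  THEOREM  VARIANCE  OF  $\hat g_{AS}$


If $\beta$ is continuous at $(x,y)$, then

\[ \sigma^2\left(\hat g_{AS}(y|x)\right) = g(y|x)\,(f(x))^{-1} \left[(Nh)^{-1} g(y|x)\,\{L_2(K) - 2L_\lambda(KB)\} + (n_c h_c^2)^{-1} L_2(B)\right] + o\left(\max\{(Nh)^{-1}, (n_c h_c^2)^{-1}\}\right). \qquad (2.8) \]


PROOF.

It follows from (2.4) that

\[ \sigma^2\left(\hat g_{AS}(y|x)\right) = (E\hat\beta_{n_c})^2 (E\hat f_N)^{-2} \{E\delta^2 - 2E\epsilon\delta + E\epsilon^2\} + O(n_c h)^{-3/2}. \]

In view of (2.0), (2.1), and (2.7), the right hand side is

\[ g^2(y|x) \left[(Nh)^{-1} (f(x))^{-1} \{L_2(K) - 2L_\lambda(KB)\} + o(Nh)^{-1} + (n_c h_c^2)^{-1} (\beta(x,y))^{-1} L_2(B) + o(n_c h_c^2)^{-1}\right] \]

which is the right hand side of (2.8).

This completes the proof of the theorem.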


2.3  REMARK


The estimator proposed by Rosenblatt (1969), which is only based on a
set of paired observations, coincides with our estimator in the case $n_a = 0$.
However, his approximation to the variance of $\hat g(y|x)$ is

\[ (n_c h_c^2)^{-1}\, g(y|x)\, L_2(B) / f(x) + o(n_c h_c^2)^{-1} \]

which is strictly larger than the approximation obtained by us. For the
case $n_a = 0$, our approximation for the variance of $\hat g(y|x)$ is

\[ \left[(n_c h_c^2)^{-1}\, g(y|x)\, L_2(B) - (n_c h_c)^{-1}\, g^2(y|x)\, L_2(K)\right] / f(x) + o(n_c h_c)^{-1}. \]

The following corollary is a consequence of Theorems 2.1 and 2.2.


Under the conditions of Theorems 2.1 and 2.2,

\[ \mathrm{MSE}[\hat g_{AS}] = g^2 \Big[\{(\beta)^{-1} l_2 h_c^2 - (f)^{-1} l_1 h^2\}^2 + (Nh)^{-1}(f)^{-1} \{L_2(K) - 2L_\lambda(KB)\} + (n_c h_c^2)^{-1}(\beta)^{-1} L_2(B)\Big] + o\left(\max\{(Nh)^{-1}, (n_c h_c^2)^{-1}\}\right). \]


2.4  REMARK


If $o(\max\{(Nh)^{-1}, (n_c h_c^2)^{-1}\})$ is ignored,
then

\[ \mathrm{MSE}[\hat g_{AS}] = w_1 + w_2 \]

where

\[ w_1 = g^2\left[h_c^4 \gamma_2^2 + (Nh)^{-1}(f)^{-1}\{L_2(K) - 2L_\lambda(KB)\} + (n_c h_c^2)^{-1}(\beta)^{-1} L_2(B)\right] \]
and

\[ w_2 = g^2 (h_c^2 - h^2)\left[(h_c^2 - h^2)\,\gamma_1^2 + 2h_c^2\, \gamma_1 \gamma_2\right]. \]

When $n_a = 0$, the case of no additional data, $w_2 = 0$ and $w_1$ is
the MSE of $\hat g(y|x)$, as is also noted by Rosenblatt (1969). Thus, we
have the following corollary.

2.2  COROLLARY

If $o(\max\{(Nh)^{-1}, (n_c h_c^2)^{-1}\})$ is ignored, then under the hypothesis
of Theorem 2.1,

\[ \mathrm{MSE}[\hat g_{AS}(y|x)] < \mathrm{MSE}[\hat g(y|x)] \]

if $\gamma_1$ and $\gamma_2$ are of opposite signs, and

\[ (1 - h^2/h_c^2)\,|\gamma_1(x)| < 2\,|\gamma_2(x,y)|. \qquad (2.10) \]

The conditions stated in Corollary 2.2, under which one would recommend
the use of additional data, are not of practical utility. They need to be
examined more critically. Our conjecture is that $\hat g_{AS}$ will not perform
better than $\hat g$ in the case of strongly dependent variables X and Y.


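2.3  THEOREM  ASYMPTOTIC  NORMALITY  OF  $\hat g_{AS}$

If $(n_c h_c^2)^{1/2} = o(h_c^{-2})$ and $n_c h_c^2 (Nh)^{-1} = o(1)$,
then

\[ (n_c h_c^2)^{1/2} \left(\hat g_{AS}(y|x) - g(y|x)\right) \xrightarrow{L} N\left(0,\ g(y|x)\,(f(x))^{-1} L_2(B)\right). \qquad (2.11) \]

PROOF.

From our foregoing analysis, it follows that

$\delta = O_p(Nh)^{-1/2}$, $\epsilon\delta = O_p(Nh)^{-1}$ and $\delta^2 = O_p(Nh)^{-1}$.

Therefore, from (2.3), we can write

\[ (n_c h_c^2)^{1/2} (\hat g_{AS} - g) = (n_c h_c^2)^{1/2} \{E\hat\beta_{n_c} \cdot (E\hat f_N)^{-1} - g\} + (n_c h_c^2)^{1/2} \cdot E\hat\beta_{n_c} \cdot (E\hat f_N)^{-1} \cdot \epsilon + o_p(1). \qquad (2.12) \]

In view of (2.0) and (2.1), the first term of the right hand side of
(2.12) is $o(1)$. Further, since from Cacoullos (1966),

\[ (n_c h_c^2)^{1/2} (\hat\beta_{n_c} - E\hat\beta_{n_c}) \xrightarrow{L} N\left(0,\ \beta L_2(B)\right), \]

the second term on the right hand side of (2.12) is asymptotically normal
with mean zero and variance $g \cdot (f)^{-1} L_2(B)$. The proof of the theorem is
now complete.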